Stream fusion for multi-stream automatic speech recognition

Authors

  • Hesam Sagha
  • Feipeng Li
  • Ehsan Variani
  • José del R. Millán
  • Ricardo Chavarriaga
  • Björn W. Schuller
Abstract

Multi-stream automatic speech recognition (MS-ASR) has been shown to improve recognition performance in noisy conditions. In such a system, the generation and fusion of the streams are the essential parts, and they need to be designed to reduce the effect of noise on the final decision. This paper shows how to improve the performance of MS-ASR by addressing two questions: (1) how many streams should be combined, and (2) how to combine them. First, we propose a novel approach based on stream reliability to select the number of streams to be fused. Second, we introduce a fusion method based on Parallel Hidden Markov Models. Applying the method to two datasets (TIMIT and RATS) with different noise types, we show an improvement in MS-ASR performance.
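
The two questions in the abstract (how many streams to fuse, and how to fuse them) can be illustrated with a small, generic sketch. It is not the method of the paper: the abstract does not define its reliability measure or give details of the Parallel Hidden Markov Model fusion. The sketch assumes, purely for illustration, that reliability is approximated by the entropy of each stream's per-frame class posteriors (lower entropy is treated as more reliable) and that fusion is an equal-weight log-linear combination. The function names `entropy`, `select_streams`, and `fuse` are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def select_streams(posteriors, k):
    """Return the indices of the k streams with the lowest posterior entropy.

    Assumption: low entropy is used as a stand-in for stream reliability.
    """
    order = sorted(range(len(posteriors)), key=lambda i: entropy(posteriors[i]))
    return sorted(order[:k])

def fuse(posteriors, selected, eps=1e-12):
    """Equal-weight log-linear fusion of the selected streams, renormalized."""
    n_classes = len(posteriors[0])
    log_scores = [
        sum(math.log(posteriors[i][c] + eps) for i in selected) / len(selected)
        for c in range(n_classes)
    ]
    peak = max(log_scores)  # subtract the maximum for numerical stability
    unnorm = [math.exp(s - peak) for s in log_scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Toy per-frame posteriors over three classes from three streams (made-up numbers).
streams = [
    [0.80, 0.15, 0.05],  # confident stream
    [0.34, 0.33, 0.33],  # nearly uniform, treated here as unreliable
    [0.70, 0.20, 0.10],
]
selected = select_streams(streams, k=2)
fused = fuse(streams, selected)
```

With these toy values the nearly uniform stream is left out, and the fused distribution still favors the first class. A real system would estimate reliability from the signal or the classifier and would fuse at the model level rather than frame by frame.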


Similar resources

DBN based multi-stream models for speech

We propose dynamic Bayesian network (DBN) based synchronous and asynchronous multi-stream models for noise-robust automatic speech recognition. In these models, multiple noise-robust features are combined into a single DBN to obtain better performance than any single feature system alone. Results on the Aurora 2.0 noisy speech task show significant improvements of our synchronous model over bot...


Using the Multi-Stream Approach for Continuous Audio-Visual Speech Recognition: Experiments on the M2VTS Database

The Multi-Stream automatic speech recognition approach was investigated in this work as a framework for Audio-Visual data fusion and speech recognition. This method presents many potential advantages for such a task. It particularly allows for synchronous decoding of continuous speech while still allowing for some asynchrony of the visual and acoustic information streams. First, the Multi-Stream f...


Ensemble Feature Selection for Multi-Stream Automatic Speech Recognition


Recognition using speech synthesis: a reactive dynamic for robust ASR

Automatic Speech Recognition (ASR) systems are not efficient under noisy speech. In the Multi-Stream (MS) approach, commonly used to reinforce ASR robustness, each stream feeds one recognizer generating estimates which are combined through a fusion process. As some streams are optimal for transmission of some phonemes [1,3], it is then interesting to overweight the best stream during the featu...


Adaptive Audio-visual Speech Recognition in the Presence of Audio and Video Distortions

Audio-visual speech recognition leads to significant improvements compared to pure audio recognition, especially when the audio signal is corrupted by noise. In this article we investigate the consequences of additional degradations in the video signal on the audio-visual recognition process. We degrade the images with noise, a JPEG compression, and errors in the localization of the mouth regio...



Journal:
  • I. J. Speech Technology

Volume 19

Publication date: 2016